Agentic Browser

Documentation

Back to Home
Home Projects Agentic Browser Browser Automation Security And Safety Mechanisms

Security And Safety Mechanisms

Table of Contents#

  1. Introduction

  2. Project Structure

  3. Core Components

  4. Architecture Overview

  5. Detailed Component Analysis

  6. Dependency Analysis

  7. Performance Considerations

  8. Troubleshooting Guide

  9. Conclusion

  10. Appendices

Introduction#

This document explains the security and safety mechanisms implemented in the browser automation system. It focuses on:

  • User approval workflow for potentially dangerous actions

  • Activity logging and audit trails

  • Intelligent content filtering

  • Agent sanitizer role in preventing malicious inputs and prompt injection validation

  • Action approval processes

  • Security boundaries between content scripts and page context

  • Permission management and safe execution environments

  • Examples of security policies, threat mitigation strategies, and incident response procedures

  • Compliance considerations and best practices for secure browser automation

Project Structure#

The security-relevant parts of the system span three layers:

  • Extension background and content scripts for safe DOM/tab operations

  • Utilities for sanitization and validation

  • Frontend settings and authentication for secure credential handling

graph TB subgraph "Extension Layer" BG["Background Script
background.ts"] CS["Content Script
content.ts"] UI["Side Panel UI
UnifiedSettingsMenu.tsx"] end subgraph "Utilities" SAN["Agent Sanitizer
agent_sanitizer.py"] PINJ["Prompt Injection Validator
prompt_injection_validator.py"] end subgraph "Execution" EXE["Action Executor
executeActions.ts"] end subgraph "Permissions" CFG["Manifest Permissions
wxt.config.ts"] end BG --> CS BG --> EXE EXE --> CS UI --> BG SAN --> BG PINJ --> BG CFG --> BG CFG --> CS

Diagram sources

Section sources

Core Components#

  • Agent Sanitizer: Validates and sanitizes action plans from the LLM, rejects unsafe constructs, and enforces required fields per action type.

  • Prompt Injection Validator: Provides a template to detect prompt injection attempts in markdown content.

  • Background Script: Orchestrates safe execution of tab-level and DOM-level actions, injects content scripts when needed, and coordinates messaging with the active tab.

  • Content Script: Executes DOM operations within the page context under strict selectors and event dispatching.

  • Action Executor: Translates high-level actions into browser APIs with minimal delay and error handling.

  • Manifest Permissions: Defines the minimal set of permissions required for safe automation.

  • Side Panel UI and Authentication: Manages secure storage of credentials and API keys, and handles OAuth flows.

Section sources

Architecture Overview#

The system separates concerns across layers to enforce security boundaries:

  • Background script controls browser-level actions and safe injection of content scripts.

  • Content script operates within the page’s DOM with explicit selectors and event simulation.

  • Utilities validate inputs and actions before execution.

  • UI manages sensitive data and authentication securely.

sequenceDiagram participant UI as "Side Panel UI
UnifiedSettingsMenu.tsx" participant BG as "Background Script
background.ts" participant CS as "Content Script
content.ts" participant TAB as "Target Tab" UI->>BG : Request action execution BG->>BG : Validate action plan via sanitizer BG->>CS : Inject content script if needed BG->>TAB : Send EXECUTE_ACTION message TAB-->>CS : Receive action payload CS->>CS : Find element by selector and simulate events CS-->>TAB : DOM mutation completes CS-->>BG : Return result BG-->>UI : Report outcome

Diagram sources

Detailed Component Analysis#

Agent Sanitizer#

The sanitizer validates JSON action plans and enforces:

  • Required fields per action type (e.g., selector for CLICK/TYPE/SELECT, url for OPEN_TAB/NAVIGATE, tab identifier or direction for SWITCH_TAB)

  • Structural checks (presence of actions array, non-empty list)

  • Safety checks for EXECUTE_SCRIPT against known dangerous patterns

  • Backward-compatible legacy JS validation

flowchart TD Start(["Sanitize JSON Actions"]) --> Strip["Remove code fences
and normalize text"] Strip --> Parse{"Parse JSON"} Parse --> |Fail| Err["Return problems:
Invalid JSON"] Parse --> |Success| CheckActions["Validate 'actions' array"] CheckActions --> ForEach["For each action"] ForEach --> TypeCheck{"Type valid?"} TypeCheck --> |No| AddProblem["Record invalid type"] TypeCheck --> |Yes| FieldChecks["Field validation per type"] FieldChecks --> ExecScript{"EXECUTE_SCRIPT?"} ExecScript --> |Yes| Safety["Scan for dangerous patterns"] ExecScript --> |No| NextAction["Next action"] Safety --> NextAction AddProblem --> NextAction NextAction --> ForEach ForEach --> Done{"All validated?"} Done --> |Yes| ReturnOK["Return data + empty problems"] Done --> |No| ReturnErr["Return data + problems"]

Diagram sources

Section sources

Prompt Injection Validator#

The validator defines a structured prompt template to classify whether a markdown text is safe or contains prompt injection attempts. It expects a binary classification response suitable for automated gating.

flowchart TD A["Receive markdown text"] --> B["Wrap with validator template"] B --> C["Send to LLM for classification"] C --> D{"true/false"} D --> |true| E["Safe content"] D --> |false| F["Flag for review or block"]

Diagram sources

Section sources

Background Script: Safe Execution Engine#

The background script coordinates:

  • Tab/window control actions (OPEN_TAB, CLOSE_TAB, SWITCH_TAB, NAVIGATE, RELOAD_TAB, DUPLICATE_TAB)

  • DOM manipulation actions (CLICK, TYPE, SCROLL, WAIT) via content script injection

  • Message routing and result aggregation

  • Waiting for navigation/reload completion with timeouts

sequenceDiagram participant BG as "Background Script" participant CS as "Content Script" participant TAB as "Target Tab" BG->>BG : handleRunGeneratedAgent(action_plan) loop For each action BG->>CS : executeScript(func,args) or sendMessage(PERFORM_ACTION) alt DOM action CS->>CS : Query selector and dispatch events else Tab action BG->>TAB : tabs.create/update/remove/duplicate end CS-->>BG : Result BG-->>BG : Aggregate results end BG-->>Caller : Final report

Diagram sources

Section sources

Content Script: Page Context Operations#

The content script executes DOM operations safely:

  • Finds elements by selector

  • Dispatches realistic input/change/keyboard events for editable and standard inputs

  • Scrolls and interacts with page elements

flowchart TD Start(["performAction(action)"]) --> Parse["Normalize and parse action"] Parse --> Click{"CLICK?"} Click --> |Yes| FindEl["querySelector(selector)"] FindEl --> Exists{"Element exists?"} Exists --> |No| Err["Throw error"] Exists --> |Yes| ClickEvt["Dispatch click"] ClickEvt --> Done["Return success"] Parse --> Type{"TYPE?"} Type --> |Yes| FindInput["querySelector(selector)"] FindInput --> Editable{"Editable?"} Editable --> |Yes| SetText["Set innerText/textContent"] Editable --> |No| SetVal["Set value"] SetText --> Events["Dispatch input/change/keydown/keyup"] SetVal --> Events Events --> Done

Diagram sources

Section sources

Action Executor: Minimal Bridge Between UI and Browser APIs#

The executor translates high-level actions into browser APIs with:

  • Targeting the active tab for DOM actions

  • Sending messages to the content script for DOM operations

  • Introducing small delays between actions to avoid overwhelming the page

flowchart TD Start(["executeBrowserActions(actions)"]) --> Loop["For each action"] Loop --> Type{"Action type?"} Type --> |OPEN_TAB| Open["Create tab with url"] Type --> |CLICK/TYPE| Active["Find active tab id"] Active --> Msg["sendMessage(EXECUTE_ACTION)"] Open --> Delay["Wait 500ms"] Msg --> Delay Delay --> Next["Next action"] Next --> Loop Loop --> End(["Done"])

Diagram sources

Section sources

Permissions and Security Boundaries#

The manifest grants minimal permissions necessary for automation:

  • Tabs, scripting, storage, identity, side panel, webNavigation, webRequest, cookies, bookmarks, history, clipboard, notifications, context menus, downloads

  • Host permissions for all URLs to enable page context operations

graph LR BG["Background Script"] --> Tabs["Tabs API"] BG --> Scripting["Scripting API"] BG --> Storage["Storage API"] BG --> Identity["Identity API"] BG --> SidePanel["Side Panel API"] BG --> WebNav["Web Navigation API"] BG --> WebReq["Web Request API"] BG --> Cookies["Cookies API"] BG --> Bookmarks["Bookmarks API"] BG --> History["History API"] BG --> Clipboard["Clipboard API"] BG --> Notifications["Notifications API"] BG --> CtxMenu["Context Menus API"] BG --> Downloads["Downloads API"] BG --> AllUrls[" Host Permissions"]

Diagram sources

Section sources

Secure Credential Management and Authentication#

The side panel UI and authentication hook:

  • Store API keys and credentials securely in browser storage

  • Manage OAuth flows with explicit consent and token lifecycle

  • Provide visibility into token status and expiry

sequenceDiagram participant UI as "UnifiedSettingsMenu.tsx" participant Auth as "useAuth.ts" participant Storage as "Browser Storage" participant Google as "Google OAuth" UI->>Auth : handleLogin() Auth->>Google : Launch web auth flow Google-->>Auth : Authorization code Auth->>Auth : Exchange code for tokens Auth->>Storage : Save user + tokens Storage-->>UI : Updated token status

Diagram sources

Section sources

Dependency Analysis#

The security-critical dependencies are:

  • Background script depends on content script for DOM operations

  • Sanitizer and validator feed into background action orchestration

  • UI depends on background for executing actions and on storage for credentials

  • Manifest permissions enable safe automation boundaries

graph TB SAN["agent_sanitizer.py"] --> BG["background.ts"] PINJ["prompt_injection_validator.py"] --> BG BG --> CS["content.ts"] BG --> EXE["executeActions.ts"] UI["UnifiedSettingsMenu.tsx"] --> BG AUTH["useAuth.ts"] --> UI CFG["wxt.config.ts"] --> BG CFG --> CS

Diagram sources

Section sources

Performance Considerations#

  • Artificial delays between actions reduce page overload and improve stability.

  • Waiting for navigation/reload completion prevents race conditions.

  • Minimal content script injection reduces overhead.

  • Event dispatching simulates realistic user interactions to minimize detection.

[No sources needed since this section provides general guidance]

Troubleshooting Guide#

Common issues and mitigations:

  • Element not found during CLICK/TYPE: Verify selector specificity and timing; ensure content script runs after page load.

  • Navigation failures: Confirm URL validity and allow sufficient completion time.

  • Sanitizer rejects action plan: Review required fields and action types; remove dangerous patterns for EXECUTE_SCRIPT.

  • Authentication errors: Re-run OAuth flow and confirm backend connectivity.

Section sources

Conclusion#

The system enforces strong security boundaries by validating inputs, limiting permissions, and isolating DOM operations to content scripts. The background script orchestrates safe actions, while the UI manages credentials securely. Together, these components provide a robust foundation for secure browser automation with logging, filtering, and approval processes.

[No sources needed since this section summarizes without analyzing specific files]

Appendices#

Security Policies and Best Practices#

  • Enforce user approval for all potentially destructive actions (OPEN_TAB, NAVIGATE, TYPE, CLICK).

  • Maintain comprehensive activity logs for every action with timestamps and outcomes.

  • Apply intelligent content filtering using prompt injection validators and sanitizer rules.

  • Limit permissions to the minimum required for automation.

  • Use secure storage for credentials and tokens; avoid exposing secrets in logs or UI.

  • Implement timeouts and retries for navigation and reload operations.

  • Regularly audit action plans and runtime logs for anomalies.

[No sources needed since this section provides general guidance]

Threat Mitigation Strategies#

  • Reject unknown action types and missing fields.

  • Block EXECUTE_SCRIPT with dangerous patterns.

  • Validate URLs for OPEN_TAB/NAVIGATE.

  • Use selectors strictly and avoid broad DOM queries.

  • Simulate realistic user events to reduce fingerprinting risk.

Section sources

Incident Response Procedures#

  • Isolate affected tabs and revoke tokens if compromise suspected.

  • Review logs for suspicious action sequences and sanitize inputs.

  • Rotate API keys and re-authenticate users.

  • Notify administrators and document the incident timeline.

[No sources needed since this section provides general guidance]